Overview

Dataset statistics

Number of variables: 15
Number of observations: 71111
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.1 MiB
Average record size in memory: 120.0 B
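
The memory figures are consistent with a shallow estimate of 8 bytes per cell plus a small fixed overhead for the row index. The sketch below checks that arithmetic. The 8-byte cell width and the 128-byte index overhead are assumptions chosen to match the reported numbers; the report does not state them.

```python
# Shallow memory estimate. Every cell is assumed to occupy 8 bytes
# (a 64-bit number or an object pointer), and the row index is assumed
# to add a fixed 128 bytes. Both constants are assumptions.
N_ROWS = 71111        # "Number of observations" in the report
N_COLS = 15           # "Number of variables" in the report
BYTES_PER_CELL = 8    # assumption
INDEX_OVERHEAD = 128  # assumption


def column_bytes(n_rows: int) -> int:
    """Bytes for a single column, including the index overhead."""
    return n_rows * BYTES_PER_CELL + INDEX_OVERHEAD


def total_bytes(n_rows: int, n_cols: int) -> int:
    """Bytes for the whole table, counting the index overhead once."""
    return n_rows * n_cols * BYTES_PER_CELL + INDEX_OVERHEAD


col_kib = column_bytes(N_ROWS) / 1024
total_mib = total_bytes(N_ROWS, N_COLS) / 1024 ** 2
avg_record = total_bytes(N_ROWS, N_COLS) / N_ROWS

print(f"{col_kib:.1f} KiB per column")
print(f"{total_mib:.1f} MiB in total")
print(f"{avg_record:.1f} B per record")
```

Under these assumptions the results round to the 555.7 KiB per variable, 8.1 MiB total, and 120.0 B per record shown in the report. A shallow estimate of this kind would not count the actual string contents of the categorical columns.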

Variable types

Categorical: 5
Numeric: 10

Alerts

id has constant value "23" Constant
type_de_station has constant value "ISS" Constant
data has a high cardinality: 71111 distinct values High cardinality
heure_de_paris has a high cardinality: 71023 distinct values High cardinality
heure_utc has a high cardinality: 71023 distinct values High cardinality
humidite is highly correlated with temperature_en_degre_c High correlation
direction_du_vecteur_de_vent_max is highly correlated with direction_du_vecteur_de_vent_max_en_degres High correlation
pluie_intensite_max is highly correlated with pluie High correlation
pluie is highly correlated with pluie_intensite_max High correlation
direction_du_vecteur_de_vent_max_en_degres is highly correlated with direction_du_vecteur_de_vent_max High correlation
force_moyenne_du_vecteur_vent is highly correlated with force_rafale_max High correlation
force_rafale_max is highly correlated with force_moyenne_du_vecteur_vent High correlation
temperature_en_degre_c is highly correlated with humidite High correlation
type_de_station is highly correlated with id High correlation
id is highly correlated with type_de_station High correlation
direction_du_vecteur_de_vent_max is highly correlated with direction_du_vecteur_vent_moyen and 3 other fields High correlation
direction_du_vecteur_vent_moyen is highly correlated with direction_du_vecteur_de_vent_max and 1 other field High correlation
direction_du_vecteur_de_vent_max_en_degres is highly correlated with direction_du_vecteur_de_vent_max and 3 other fields High correlation
force_moyenne_du_vecteur_vent is highly correlated with direction_du_vecteur_de_vent_max and 2 other fields High correlation
force_rafale_max is highly correlated with direction_du_vecteur_de_vent_max and 2 other fields High correlation
data is uniformly distributed Uniform
heure_de_paris is uniformly distributed Uniform
heure_utc is uniformly distributed Uniform
data has unique values Unique
direction_du_vecteur_de_vent_max has 12389 (17.4%) zeros Zeros
pluie_intensite_max has 68056 (95.7%) zeros Zeros
direction_du_vecteur_vent_moyen has 36435 (51.2%) zeros Zeros
pluie has 70139 (98.6%) zeros Zeros
direction_du_vecteur_de_vent_max_en_degres has 12389 (17.4%) zeros Zeros
force_moyenne_du_vecteur_vent has 14033 (19.7%) zeros Zeros
force_rafale_max has 8431 (11.9%) zeros Zeros

Reproduction

Analysis started: 2022-07-12 22:55:39.013480
Analysis finished: 2022-07-12 22:56:16.509295
Duration: 37.5 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

data
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 71111
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 555.7 KiB

Value | Count
2f84156dda580000e8744800 | 1
2f0b22315a3009406bda2c00 | 1
2f0c22f15a3000000b800000 | 1
2f0b28f41d700a802b9e0c00 | 1
2f0b245215f807e04bd62c00 | 1
Other values (71106) | 71106

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 1706664
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 71111
Unique (%): 100.0%

Sample

1st row: 2f84156dda580000e8744800
2nd row: 2f87146d2680000049b21800
3rd row: 2f8715cd96880000e9903840
4th row: 2f87166dca880000c9702c40
5th row: 2f87168dd2880000a96e3840

Common Values

Value | Count | Frequency (%)
2f84156dda580000e8744800 | 1 | < 0.1%
2f0b22315a3009406bda2c00 | 1 | < 0.1%
2f0c22f15a3000000b800000 | 1 | < 0.1%
2f0b28f41d700a802b9e0c00 | 1 | < 0.1%
2f0b245215f807e04bd62c00 | 1 | < 0.1%
2f0d23b15a4008f04c7c1800 | 1 | < 0.1%
2f0b22914e3009806bda3400 | 1 | < 0.1%
2f0b2251563009406bd82c00 | 1 | < 0.1%
2f0b22115e3008204bb83400 | 1 | < 0.1%
2f0a28b3998009a02b9c1800 | 1 | < 0.1%
Other values (71101) | 71101 | > 99.9%

Length

Histogram of lengths of the category (figure not reproduced; every value has length 24).

Most occurring characters

ValueCountFrequency (%)
0508667
29.8%
2169619
 
9.9%
8124530
 
7.3%
1109110
 
6.4%
e97689
 
5.7%
493019
 
5.5%
c76646
 
4.5%
676097
 
4.5%
f71805
 
4.2%
366677
 
3.9%
Other values (6)312805
18.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1291386
75.7%
Lowercase Letter415278
 
24.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0508667
39.4%
2169619
 
13.1%
8124530
 
9.6%
1109110
 
8.4%
493019
 
7.2%
676097
 
5.9%
366677
 
5.2%
557602
 
4.5%
947825
 
3.7%
738240
 
3.0%
Lowercase Letter
ValueCountFrequency (%)
e97689
23.5%
c76646
18.5%
f71805
17.3%
a64114
15.4%
b61377
14.8%
d43647
10.5%

Most occurring scripts

ValueCountFrequency (%)
Common1291386
75.7%
Latin415278
 
24.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0508667
39.4%
2169619
 
13.1%
8124530
 
9.6%
1109110
 
8.4%
493019
 
7.2%
676097
 
5.9%
366677
 
5.2%
557602
 
4.5%
947825
 
3.7%
738240
 
3.0%
Latin
ValueCountFrequency (%)
e97689
23.5%
c76646
18.5%
f71805
17.3%
a64114
15.4%
b61377
14.8%
d43647
10.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII1706664
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0508667
29.8%
2169619
 
9.9%
8124530
 
7.3%
1109110
 
6.4%
e97689
 
5.7%
493019
 
5.5%
c76646
 
4.5%
676097
 
4.5%
f71805
 
4.2%
366677
 
3.9%
Other values (6)312805
18.3%

id
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 555.7 KiB

Value | Count
23 | 71111

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters142222
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row23
2nd row23
3rd row23
4th row23
5th row23

Common Values

ValueCountFrequency (%)
2371111
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
2371111
100.0%

Most occurring characters

ValueCountFrequency (%)
271111
50.0%
371111
50.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number142222
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
271111
50.0%
371111
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common142222
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
271111
50.0%
371111
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII142222
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
271111
50.0%
371111
50.0%

humidite
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 89
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.56715557
Minimum: 0
Maximum: 97
Zeros: 185
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 555.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 37
Q1: 58
Median: 69
Q3: 75
95-th percentile: 84
Maximum: 97
Range: 97
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 14.45083463
Coefficient of variation (CV): 0.2203974612
Kurtosis: 1.199640646
Mean: 65.56715557
Median Absolute Deviation (MAD): 8
Skewness: -0.9570391631
Sum: 4662546
Variance: 208.8266214
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
74 | 2954 | 4.2%
73 | 2897 | 4.1%
76 | 2884 | 4.1%
72 | 2732 | 3.8%
75 | 2697 | 3.8%
71 | 2540 | 3.6%
70 | 2316 | 3.3%
69 | 2157 | 3.0%
77 | 2068 | 2.9%
78 | 2026 | 2.8%
Other values (79) | 45840 | 64.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 185 | 0.3%
10 | 3 | < 0.1%
11 | 9 | < 0.1%
12 | 15 | < 0.1%
13 | 15 | < 0.1%
14 | 6 | < 0.1%
15 | 25 | < 0.1%
16 | 25 | < 0.1%
17 | 29 | < 0.1%
18 | 31 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
97 | 19 | < 0.1%
96 | 66 | 0.1%
95 | 118 | 0.2%
94 | 290 | 0.4%
93 | 293 | 0.4%
92 | 237 | 0.3%
91 | 181 | 0.3%
90 | 186 | 0.3%
89 | 260 | 0.4%
88 | 370 | 0.5%
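
Several figures in the humidite summary are derived from others: the range is the maximum minus the minimum, the interquartile range is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below recomputes them from the reported values as a sanity check; the variable names are illustrative only.

```python
import math

# Values copied from the humidite summary in this report.
minimum, maximum = 0, 97
q1, q3 = 58, 75
mean = 65.56715557
std = 14.45083463

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 6), round(variance, 4))

# The recomputed values agree with the reported ones up to rounding.
assert math.isclose(cv, 0.2203974612, rel_tol=1e-6)
assert math.isclose(variance, 208.8266214, rel_tol=1e-6)
```

The same relations hold for the other numeric variables, so they offer a quick way to spot transcription errors in the flattened statistics.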

direction_du_vecteur_de_vent_max
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.870343547
Minimum: 0
Maximum: 15
Zeros: 12389
Zeros (%): 17.4%
Negative: 0
Negative (%): 0.0%
Memory size: 555.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 4
Median: 9
Q3: 12
95-th percentile: 14
Maximum: 15
Range: 15
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 4.81699192
Coefficient of variation (CV): 0.6120434122
Kurtosis: -1.102491473
Mean: 7.870343547
Median Absolute Deviation (MAD): 3
Skewness: -0.4246635816
Sum: 559668
Variance: 23.20341116
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
ValueCountFrequency (%)
012389
17.4%
119507
13.4%
128795
12.4%
76249
8.8%
85254
7.4%
134602
 
6.5%
144019
 
5.7%
103768
 
5.3%
53508
 
4.9%
153019
 
4.2%
Other values (6)10001
14.1%
ValueCountFrequency (%)
012389
17.4%
1887
 
1.2%
2836
 
1.2%
32038
 
2.9%
42305
 
3.2%
53508
 
4.9%
62043
 
2.9%
76249
8.8%
85254
7.4%
91892
 
2.7%
ValueCountFrequency (%)
153019
 
4.2%
144019
5.7%
134602
6.5%
128795
12.4%
119507
13.4%
103768
 
5.3%
91892
 
2.7%
85254
7.4%
76249
8.8%
62043
 
2.9%

pluie_intensite_max
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.009244701945
Minimum: 0
Maximum: 2.4
Zeros: 68056
Zeros (%): 95.7%
Negative: 0
Negative (%): 0.0%
Memory size: 555.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 2.4
Range: 2.4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.04821705555
Coefficient of variation (CV): 5.215641979
Kurtosis: 280.4643935
Mean: 0.009244701945
Median Absolute Deviation (MAD): 0
Skewness: 10.39433843
Sum: 657.4
Variance: 0.002324884446
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
068056
95.7%
0.22914
 
4.1%
0.4106
 
0.1%
0.617
 
< 0.1%
0.87
 
< 0.1%
1.23
 
< 0.1%
13
 
< 0.1%
2.22
 
< 0.1%
1.41
 
< 0.1%
2.41
 
< 0.1%
ValueCountFrequency (%)
068056
95.7%
0.22914
 
4.1%
0.4106
 
0.1%
0.617
 
< 0.1%
0.87
 
< 0.1%
13
 
< 0.1%
1.23
 
< 0.1%
1.41
 
< 0.1%
1.61
 
< 0.1%
2.22
 
< 0.1%
ValueCountFrequency (%)
2.41
 
< 0.1%
2.22
 
< 0.1%
1.61
 
< 0.1%
1.41
 
< 0.1%
1.23
 
< 0.1%
13
 
< 0.1%
0.87
 
< 0.1%
0.617
 
< 0.1%
0.4106
 
0.1%
0.22914
4.1%

pression
Real number (ℝ≥0)

Distinct: 54
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99292.40764
Minimum: 90000
Maximum: 101300
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 555.7 KiB

Quantile statistics

Minimum: 90000
5-th percentile: 98100
Q1: 99000
Median: 99400
Q3: 99800
95-th percentile: 100400
Maximum: 101300
Range: 11300
Interquartile range (IQR): 800

Descriptive statistics

Standard deviation: 1004.003173
Coefficient of variation (CV): 0.0101115805
Kurtosis: 44.27355184
Mean: 99292.40764
Median Absolute Deviation (MAD): 400
Skewness: -5.082383375
Sum: 7060782400
Variance: 1008022.372
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
994005122
 
7.2%
993004581
 
6.4%
995004565
 
6.4%
992004432
 
6.2%
996004286
 
6.0%
991004171
 
5.9%
990004047
 
5.7%
997003956
 
5.6%
989003244
 
4.6%
998003199
 
4.5%
Other values (44)29508
41.5%
ValueCountFrequency (%)
90000450
0.6%
961002
 
< 0.1%
962002
 
< 0.1%
963003
 
< 0.1%
9640021
 
< 0.1%
9650018
 
< 0.1%
9660063
 
0.1%
9670077
 
0.1%
9680087
 
0.1%
9690079
 
0.1%
ValueCountFrequency (%)
1013006
 
< 0.1%
10120056
 
0.1%
101100124
 
0.2%
101000117
 
0.2%
100900318
 
0.4%
100800487
 
0.7%
100700589
0.8%
100600712
1.0%
100500840
1.2%
1004001357
1.9%

direction_du_vecteur_vent_moyen
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 181
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.08005794
Minimum: 0
Maximum: 360
Zeros: 36435
Zeros (%): 51.2%
Negative: 0
Negative (%): 0.0%
Memory size: 555.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 146
95-th percentile: 294
Maximum: 360
Range: 360
Interquartile range (IQR): 146

Descriptive statistics

Standard deviation: 105.2144762
Coefficient of variation (CV): 1.266422759
Kurtosis: -0.3785593856
Mean: 83.08005794
Median Absolute Deviation (MAD): 0
Skewness: 0.9639341518
Sum: 5907906
Variance: 11070.086
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
036435
51.2%
134923
 
1.3%
156842
 
1.2%
112748
 
1.1%
90672
 
0.9%
246487
 
0.7%
92486
 
0.7%
94471
 
0.7%
66462
 
0.6%
96438
 
0.6%
Other values (171)29147
41.0%
ValueCountFrequency (%)
036435
51.2%
2148
 
0.2%
4150
 
0.2%
6127
 
0.2%
8111
 
0.2%
1099
 
0.1%
1292
 
0.1%
1494
 
0.1%
1667
 
0.1%
1888
 
0.1%
ValueCountFrequency (%)
3603
 
< 0.1%
35858
0.1%
35667
0.1%
35484
0.1%
35287
0.1%
35081
0.1%
34898
0.1%
34686
0.1%
34490
0.1%
34274
0.1%

type_de_station
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 555.7 KiB

Value | Count
ISS | 71111

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters213333
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowISS
2nd rowISS
3rd rowISS
4th rowISS
5th rowISS

Common Values

ValueCountFrequency (%)
ISS71111
100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
iss71111
100.0%

Most occurring characters

ValueCountFrequency (%)
S142222
66.7%
I71111
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter213333
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S142222
66.7%
I71111
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin213333
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S142222
66.7%
I71111
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII213333
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S142222
66.7%
I71111
33.3%

pluie
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.005914696742
Minimum: 0
Maximum: 3.2
Zeros: 70139
Zeros (%): 98.6%
Negative: 0
Negative (%): 0.0%
Memory size: 555.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 3.2
Range: 3.2
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.06585891503
Coefficient of variation (CV): 11.1347915
Kurtosis: 421.9369147
Mean: 0.005914696742
Median Absolute Deviation (MAD): 0
Skewness: 17.56521456
Sum: 420.6
Variance: 0.004337396689
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
ValueCountFrequency (%)
070139
98.6%
0.2507
 
0.7%
0.4205
 
0.3%
0.6102
 
0.1%
0.859
 
0.1%
137
 
0.1%
1.231
 
< 0.1%
1.412
 
< 0.1%
1.86
 
< 0.1%
25
 
< 0.1%
Other values (4)8
 
< 0.1%
ValueCountFrequency (%)
070139
98.6%
0.2507
 
0.7%
0.4205
 
0.3%
0.6102
 
0.1%
0.859
 
0.1%
137
 
0.1%
1.231
 
< 0.1%
1.412
 
< 0.1%
1.64
 
< 0.1%
1.86
 
< 0.1%
ValueCountFrequency (%)
3.21
 
< 0.1%
2.61
 
< 0.1%
2.42
 
< 0.1%
25
 
< 0.1%
1.86
 
< 0.1%
1.64
 
< 0.1%
1.412
 
< 0.1%
1.231
< 0.1%
137
0.1%
0.859
0.1%

direction_du_vecteur_de_vent_max_en_degres
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 177.0827298
Minimum: 0
Maximum: 337.5
Zeros: 12389
Zeros (%): 17.4%
Negative: 0
Negative (%): 0.0%
Memory size: 555.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 90
Median: 202.5
Q3: 270
95-th percentile: 315
Maximum: 337.5
Range: 337.5
Interquartile range (IQR): 180

Descriptive statistics

Standard deviation: 108.3823182
Coefficient of variation (CV): 0.6120434122
Kurtosis: -1.102491473
Mean: 177.0827298
Median Absolute Deviation (MAD): 67.5
Skewness: -0.4246635816
Sum: 12592530
Variance: 11746.7269
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
ValueCountFrequency (%)
012389
17.4%
247.59507
13.4%
2708795
12.4%
157.56249
8.8%
1805254
7.4%
292.54602
 
6.5%
3154019
 
5.7%
2253768
 
5.3%
112.53508
 
4.9%
337.53019
 
4.2%
Other values (6)10001
14.1%
ValueCountFrequency (%)
012389
17.4%
22.5887
 
1.2%
45836
 
1.2%
67.52038
 
2.9%
902305
 
3.2%
112.53508
 
4.9%
1352043
 
2.9%
157.56249
8.8%
1805254
7.4%
202.51892
 
2.7%
ValueCountFrequency (%)
337.53019
 
4.2%
3154019
5.7%
292.54602
6.5%
2708795
12.4%
247.59507
13.4%
2253768
 
5.3%
202.51892
 
2.7%
1805254
7.4%
157.56249
8.8%
1352043
 
2.9%
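
The two wind-direction variables appear to carry the same information: direction_du_vecteur_de_vent_max takes 16 integer values from 0 to 15, direction_du_vecteur_de_vent_max_en_degres takes 16 values from 0 to 337.5, and their counts, skewness, kurtosis, and coefficient of variation are identical. This is consistent with a 16-sector compass index multiplied by 22.5 degrees. The sketch below states that mapping; it is an inference from the statistics, and the function name is hypothetical.

```python
SECTOR_WIDTH_DEG = 360 / 16  # 22.5 degrees per sector (assumed 16-point compass)


def sector_to_degrees(sector: int) -> float:
    """Map a sector index in 0..15 to a bearing in degrees."""
    if not 0 <= sector <= 15:
        raise ValueError(f"sector out of range: {sector}")
    return sector * SECTOR_WIDTH_DEG


# Spot checks against the two summaries.
print(sector_to_degrees(15))  # compare with the maximum in degrees
print(sector_to_degrees(9))   # compare with the median in degrees
print(sector_to_degrees(4))   # compare with Q1 in degrees

# The reported means scale the same way: 7.870343547 * 22.5 is about 177.0827298.
print(7.870343547 * SECTOR_WIDTH_DEG)
```

If the mapping holds, one of the two columns could be dropped before modelling, which would also remove the corresponding high-correlation alert.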

force_moyenne_du_vecteur_vent
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 30
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.876531057
Minimum: 0
Maximum: 30
Zeros: 14033
Zeros (%): 19.7%
Negative: 0
Negative (%): 0.0%
Memory size: 555.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 3
Q3: 6
95-th percentile: 12
Maximum: 30
Range: 30
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.821528148
Coefficient of variation (CV): 0.9858113069
Kurtosis: 2.236900452
Mean: 3.876531057
Median Absolute Deviation (MAD): 2
Skewness: 1.403283341
Sum: 275664
Variance: 14.60407738
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
014033
19.7%
29206
12.9%
18871
12.5%
38306
11.7%
46826
9.6%
55324
 
7.5%
64251
 
6.0%
73205
 
4.5%
82537
 
3.6%
91951
 
2.7%
Other values (20)6601
9.3%
ValueCountFrequency (%)
014033
19.7%
18871
12.5%
29206
12.9%
38306
11.7%
46826
9.6%
55324
 
7.5%
64251
 
6.0%
73205
 
4.5%
82537
 
3.6%
91951
 
2.7%
ValueCountFrequency (%)
301
 
< 0.1%
281
 
< 0.1%
273
 
< 0.1%
264
 
< 0.1%
256
 
< 0.1%
2413
 
< 0.1%
2318
 
< 0.1%
2227
 
< 0.1%
2158
0.1%
2077
0.1%

force_rafale_max
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 41
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.80654189
Minimum: 0
Maximum: 70
Zeros: 8431
Zeros (%): 11.9%
Negative: 0
Negative (%): 0.0%
Memory size: 555.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 11
Q3: 18
95-th percentile: 27
Maximum: 70
Range: 70
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 8.960648981
Coefficient of variation (CV): 0.7589562692
Kurtosis: 0.965983932
Mean: 11.80654189
Median Absolute Deviation (MAD): 6
Skewness: 0.8444703712
Sum: 839575
Variance: 80.29323016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)
ValueCountFrequency (%)
08431
 
11.9%
65509
 
7.7%
54920
 
6.9%
134818
 
6.8%
144805
 
6.8%
34642
 
6.5%
114445
 
6.3%
164393
 
6.2%
104154
 
5.8%
183782
 
5.3%
Other values (31)21212
29.8%
ValueCountFrequency (%)
08431
11.9%
22533
 
3.6%
34642
6.5%
54920
6.9%
65509
7.7%
83668
5.2%
104154
5.8%
114445
6.3%
134818
6.8%
144805
6.8%
ValueCountFrequency (%)
701
 
< 0.1%
667
 
< 0.1%
623
 
< 0.1%
611
 
< 0.1%
586
 
< 0.1%
567
 
< 0.1%
549
 
< 0.1%
5318
< 0.1%
5121
< 0.1%
5040
0.1%

temperature_en_degre_c
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 426
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.98287044
Minimum: -50
Maximum: 37.2
Zeros: 45
Zeros (%): 0.1%
Negative: 1521
Negative (%): 2.1%
Memory size: 555.7 KiB

Quantile statistics

Minimum: -50
5-th percentile: 2.2
Q1: 9
Median: 13.7
Q3: 19
95-th percentile: 27.1
Maximum: 37.2
Range: 87.2
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.01952736
Coefficient of variation (CV): 0.5735251137
Kurtosis: 9.35411793
Mean: 13.98287044
Median Absolute Deviation (MAD): 5
Skewness: -1.087448638
Sum: 994335.9
Variance: 64.31281908
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
12.4460
 
0.6%
13.3453
 
0.6%
12.5445
 
0.6%
13.7443
 
0.6%
13.4436
 
0.6%
13.5424
 
0.6%
12.6423
 
0.6%
10.8419
 
0.6%
10.7417
 
0.6%
11.3416
 
0.6%
Other values (416)66775
93.9%
ValueCountFrequency (%)
-50181
0.3%
-6.93
 
< 0.1%
-6.82
 
< 0.1%
-6.74
 
< 0.1%
-6.63
 
< 0.1%
-6.53
 
< 0.1%
-5.98
 
< 0.1%
-5.83
 
< 0.1%
-5.71
 
< 0.1%
-5.61
 
< 0.1%
ValueCountFrequency (%)
37.22
 
< 0.1%
37.13
 
< 0.1%
377
< 0.1%
36.94
< 0.1%
36.88
< 0.1%
36.76
< 0.1%
36.63
 
< 0.1%
36.55
< 0.1%
36.47
< 0.1%
36.39
< 0.1%

heure_de_paris
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 71023
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 555.7 KiB

Value | Count
2021-11-26T22:45:00+01:00 | 2
2022-07-09T23:45:00+02:00 | 2
2022-05-04T21:30:00+02:00 | 2
2021-10-31T13:00:00+01:00 | 2
2021-07-29T17:00:00+02:00 | 2
Other values (71018) | 71101

Length

Max length25
Median length25
Mean length25
Min length25

Characters and Unicode

Total characters1777775
Distinct characters14
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique70935 ?
Unique (%)99.8%

Sample

1st row2020-12-04T11:30:00+01:00
2nd row2020-12-07T09:30:00+01:00
3rd row2020-12-07T12:15:00+01:00
4th row2020-12-07T13:30:00+01:00
5th row2020-12-07T13:45:00+01:00

Common Values

ValueCountFrequency (%)
2021-11-26T22:45:00+01:002
 
< 0.1%
2022-07-09T23:45:00+02:002
 
< 0.1%
2022-05-04T21:30:00+02:002
 
< 0.1%
2021-10-31T13:00:00+01:002
 
< 0.1%
2021-07-29T17:00:00+02:002
 
< 0.1%
2022-03-26T23:45:00+01:002
 
< 0.1%
2022-02-17T12:45:00+01:002
 
< 0.1%
2022-05-21T23:00:00+02:002
 
< 0.1%
2022-05-14T23:00:00+02:002
 
< 0.1%
2022-04-18T02:45:00+02:002
 
< 0.1%
Other values (71013)71091
> 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2021-11-26t22:45:00+01:002
 
< 0.1%
2021-12-13t07:15:00+01:002
 
< 0.1%
2022-03-19t17:15:00+01:002
 
< 0.1%
2022-02-26t08:30:00+01:002
 
< 0.1%
2021-09-18t20:45:00+02:002
 
< 0.1%
2021-11-08t03:00:00+01:002
 
< 0.1%
2022-07-02t17:00:00+02:002
 
< 0.1%
2021-10-31t01:45:00+02:002
 
< 0.1%
2022-02-23t08:00:00+01:002
 
< 0.1%
2021-10-31t13:30:00+01:002
 
< 0.1%
Other values (71013)71091
> 99.9%

Most occurring characters

ValueCountFrequency (%)
0637631
35.9%
2251266
 
14.1%
:213333
 
12.0%
1171149
 
9.6%
-142222
 
8.0%
T71111
 
4.0%
+71111
 
4.0%
553500
 
3.0%
342736
 
2.4%
436020
 
2.0%
Other values (4)87696
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1279998
72.0%
Other Punctuation213333
 
12.0%
Dash Punctuation142222
 
8.0%
Uppercase Letter71111
 
4.0%
Math Symbol71111
 
4.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0637631
49.8%
2251266
 
19.6%
1171149
 
13.4%
553500
 
4.2%
342736
 
3.3%
436020
 
2.8%
931686
 
2.5%
820164
 
1.6%
720142
 
1.6%
615704
 
1.2%
Other Punctuation
ValueCountFrequency (%)
:213333
100.0%
Dash Punctuation
ValueCountFrequency (%)
-142222
100.0%
Uppercase Letter
ValueCountFrequency (%)
T71111
100.0%
Math Symbol
ValueCountFrequency (%)
+71111
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1706664
96.0%
Latin71111
 
4.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0637631
37.4%
2251266
 
14.7%
:213333
 
12.5%
1171149
 
10.0%
-142222
 
8.3%
+71111
 
4.2%
553500
 
3.1%
342736
 
2.5%
436020
 
2.1%
931686
 
1.9%
Other values (3)56010
 
3.3%
Latin
ValueCountFrequency (%)
T71111
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1777775
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0637631
35.9%
2251266
 
14.1%
:213333
 
12.0%
1171149
 
9.6%
-142222
 
8.0%
T71111
 
4.0%
+71111
 
4.0%
553500
 
3.0%
342736
 
2.4%
436020
 
2.0%
Other values (4)87696
 
4.9%

heure_utc
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 71023
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 555.7 KiB

Value | Count
2021-11-26T21:45:00+00:00 | 2
2022-07-09T21:45:00+00:00 | 2
2022-05-04T19:30:00+00:00 | 2
2021-10-31T12:00:00+00:00 | 2
2021-07-29T15:00:00+00:00 | 2
Other values (71018) | 71101

Length

Max length25
Median length25
Mean length25
Min length25

Characters and Unicode

Total characters1777775
Distinct characters14
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique70935 ?
Unique (%)99.8%

Sample

1st row2020-12-04T10:30:00+00:00
2nd row2020-12-07T08:30:00+00:00
3rd row2020-12-07T11:15:00+00:00
4th row2020-12-07T12:30:00+00:00
5th row2020-12-07T12:45:00+00:00
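
The heure_de_paris and heure_utc samples look like the same instants written with different UTC offsets, which would explain why both variables have identical distinct and unique counts. The sketch below converts an offset-aware ISO 8601 string to UTC using only the standard library; the function name is hypothetical and the input strings are copied from this report.

```python
from datetime import datetime, timezone


def to_utc(stamp: str) -> str:
    """Convert an ISO 8601 timestamp with a UTC offset to its UTC form."""
    return datetime.fromisoformat(stamp).astimezone(timezone.utc).isoformat()


# A winter (+01:00) and a summer (+02:00) value from heure_de_paris.
print(to_utc("2020-12-04T11:30:00+01:00"))
print(to_utc("2022-07-09T23:45:00+02:00"))
```

Both results match the corresponding heure_utc values listed in the report. Parsing either column into a datetime type before profiling would also avoid the high-cardinality categorical treatment seen here.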

Common Values

ValueCountFrequency (%)
2021-11-26T21:45:00+00:002
 
< 0.1%
2022-07-09T21:45:00+00:002
 
< 0.1%
2022-05-04T19:30:00+00:002
 
< 0.1%
2021-10-31T12:00:00+00:002
 
< 0.1%
2021-07-29T15:00:00+00:002
 
< 0.1%
2022-03-26T22:45:00+00:002
 
< 0.1%
2022-02-17T11:45:00+00:002
 
< 0.1%
2022-05-21T21:00:00+00:002
 
< 0.1%
2022-05-14T21:00:00+00:002
 
< 0.1%
2022-04-18T00:45:00+00:002
 
< 0.1%
Other values (71013)71091
> 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2021-11-26t21:45:00+00:002
 
< 0.1%
2021-12-13t06:15:00+00:002
 
< 0.1%
2022-03-19t16:15:00+00:002
 
< 0.1%
2022-02-26t07:30:00+00:002
 
< 0.1%
2021-09-18t18:45:00+00:002
 
< 0.1%
2021-11-08t02:00:00+00:002
 
< 0.1%
2022-07-02t15:00:00+00:002
 
< 0.1%
2021-10-30t23:45:00+00:002
 
< 0.1%
2022-02-23t07:00:00+00:002
 
< 0.1%
2021-10-31t12:30:00+00:002
 
< 0.1%
Other values (71013)71091
> 99.9%

Most occurring characters

ValueCountFrequency (%)
0708784
39.9%
:213333
 
12.0%
2207339
 
11.7%
1143733
 
8.1%
-142222
 
8.0%
T71111
 
4.0%
+71111
 
4.0%
553535
 
3.0%
342405
 
2.4%
435986
 
2.0%
Other values (4)88216
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1279998
72.0%
Other Punctuation213333
 
12.0%
Dash Punctuation142222
 
8.0%
Uppercase Letter71111
 
4.0%
Math Symbol71111
 
4.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0708784
55.4%
2207339
 
16.2%
1143733
 
11.2%
553535
 
4.2%
342405
 
3.3%
435986
 
2.8%
931705
 
2.5%
720436
 
1.6%
820232
 
1.6%
615843
 
1.2%
Other Punctuation
ValueCountFrequency (%)
:213333
100.0%
Dash Punctuation
ValueCountFrequency (%)
-142222
100.0%
Uppercase Letter
ValueCountFrequency (%)
T71111
100.0%
Math Symbol
ValueCountFrequency (%)
+71111
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1706664
96.0%
Latin71111
 
4.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0708784
41.5%
:213333
 
12.5%
2207339
 
12.1%
1143733
 
8.4%
-142222
 
8.3%
+71111
 
4.2%
553535
 
3.1%
342405
 
2.5%
435986
 
2.1%
931705
 
1.9%
Other values (3)56511
 
3.3%
Latin
ValueCountFrequency (%)
T71111
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1777775
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0708784
39.9%
:213333
 
12.0%
2207339
 
11.7%
1143733
 
8.1%
-142222
 
8.0%
T71111
 
4.0%
+71111
 
4.0%
553535
 
3.0%
342405
 
2.4%
435986
 
2.0%
Other values (4)88216
 
5.0%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
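
As a minimal sketch of that recipe, the code below ranks both variables (ties share the mean of their positions) and then divides the covariance of the ranks by the product of their standard deviations. It uses only the standard library; the helper names and the toy data are illustrative and are not taken from the profiled dataset.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def covariance_ratio(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


def spearman_rho(x, y):
    """Spearman's rho: the covariance ratio applied to the rank variables."""
    return covariance_ratio(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho is 1, the raw ratio is below 1.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_rho(x, y), covariance_ratio(x, y))
```

The toy data shows why the rank-based coefficient suits variables such as rainfall here, where the relationship may be monotonic without being linear.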

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
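
The sketch below implements that definition directly and illustrates the invariance under changes of location and scale. It is a plain standard-library illustration with made-up data, not the code used to produce this report.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariance over the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([3 * a + 2 for a in x], [0.5 * b - 7 for b in y])

print(r, r_transformed)
```

The two printed values agree up to floating-point error. This is also why direction_du_vecteur_de_vent_max and its degree-valued counterpart, which appear to differ only by a constant factor, trigger a high-correlation alert.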

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
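
A minimal sketch of a bias-corrected Cramér's V in the style of Bergsma's correction is shown below, assuming the usual formulation in which both the φ² statistic and the table dimensions are adjusted by terms in 1/(n-1). It is written against the standard library for illustration and is not the report's own implementation; returning 0 for a constant variable is a choice made here.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long categorical sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        return 0.0  # a constant variable: treated here as no association
    return sqrt(phi2_corr / denominator)


perfect = cramers_v_corrected(["a", "b"] * 50, ["u", "v"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["u", "v", "u", "v"] * 25)
print(perfect, independent)
```

Constant columns such as id and type_de_station fall into the degenerate branch, so any association reported for them is best read as an artefact of the measure rather than a real relationship.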

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index) | data | id | humidite | direction_du_vecteur_de_vent_max | pluie_intensite_max | pression | direction_du_vecteur_vent_moyen | type_de_station | pluie | direction_du_vecteur_de_vent_max_en_degres | force_moyenne_du_vecteur_vent | force_rafale_max | temperature_en_degre_c | heure_de_paris | heure_utc
0 | 2f84156dda580000e8744800 | 23 | 75 | 10 | 0.0 | 96700 | 0 | ISS | 0.0 | 225.0 | 7 | 18 | 5.6 | 2020-12-04T11:30:00+01:00 | 2020-12-04T10:30:00+00:00
1 | 2f87146d2680000049b21800 | 23 | 80 | 9 | 0.0 | 97700 | 0 | ISS | 0.0 | 202.5 | 2 | 6 | 2.9 | 2020-12-07T09:30:00+01:00 | 2020-12-07T08:30:00+00:00
2 | 2f8715cd96880000e9903840 | 23 | 81 | 8 | 0.2 | 97600 | 0 | ISS | 0.0 | 180.0 | 7 | 14 | 4.5 | 2020-12-07T12:15:00+01:00 | 2020-12-07T11:15:00+00:00
3 | 2f87166dca880000c9702c40 | 23 | 81 | 8 | 0.2 | 97500 | 0 | ISS | 0.0 | 180.0 | 6 | 11 | 5.2 | 2020-12-07T13:30:00+01:00 | 2020-12-07T12:30:00+00:00
4 | 2f87168dd2880000a96e3840 | 23 | 81 | 7 | 0.2 | 97500 | 0 | ISS | 0.0 | 157.5 | 5 | 14 | 5.4 | 2020-12-07T13:45:00+01:00 | 2020-12-07T12:45:00+00:00
5 | 2f8716cde2880000e9504840 | 23 | 81 | 8 | 0.2 | 97400 | 0 | ISS | 0.0 | 180.0 | 7 | 18 | 5.8 | 2020-12-07T14:15:00+01:00 | 2020-12-07T13:15:00+00:00
6 | 2f87174e4280000149545400 | 23 | 80 | 10 | 0.0 | 97400 | 0 | ISS | 0.0 | 225.0 | 10 | 21 | 7.0 | 2020-12-07T15:15:00+01:00 | 2020-12-07T14:15:00+00:00
7 | 2f87176e4678000189567800 | 23 | 79 | 11 | 0.0 | 97400 | 0 | ISS | 0.0 | 247.5 | 12 | 30 | 7.1 | 2020-12-07T15:30:00+01:00 | 2020-12-07T14:30:00+00:00
8 | 2f8718ce5e38000109544c00 | 23 | 71 | 10 | 0.0 | 97400 | 0 | ISS | 0.0 | 225.0 | 8 | 19 | 7.7 | 2020-12-07T18:15:00+01:00 | 2020-12-07T17:15:00+00:00
9 | 2f87198e2260000129745840 | 23 | 76 | 10 | 0.2 | 97500 | 0 | ISS | 0.0 | 225.0 | 9 | 22 | 6.8 | 2020-12-07T19:45:00+01:00 | 2020-12-07T18:45:00+00:00

Last rows

(index) | data | id | humidite | direction_du_vecteur_de_vent_max | pluie_intensite_max | pression | direction_du_vecteur_vent_moyen | type_de_station | pluie | direction_du_vecteur_de_vent_max_en_degres | force_moyenne_du_vecteur_vent | force_rafale_max | temperature_en_degre_c | heure_de_paris | heure_utc
71101 | 2ee83b914db809800c9a0c00 | 23 | 55 | 13 | 0.0 | 100000 | 304 | ISS | 0.0 | 292.5 | 0 | 3 | 19.3 | 2022-07-09T00:45:00+02:00 | 2022-07-08T22:45:00+00:00
71102 | 2ee83350a21808f08cf64000 | 23 | 67 | 11 | 0.0 | 100300 | 286 | ISS | 0.0 | 247.5 | 4 | 16 | 16.8 | 2022-07-08T08:15:00+02:00 | 2022-07-08T06:15:00+00:00
71103 | 2ee833b0d60800010ce05800 | 23 | 65 | 0 | 0.0 | 100300 | 0 | ISS | 0.0 | 0.0 | 8 | 22 | 17.5 | 2022-07-08T09:00:00+02:00 | 2022-07-08T07:00:00+00:00
71104 | 2ee8343141f00000ece04c00 | 23 | 62 | 0 | 0.0 | 100300 | 0 | ISS | 0.0 | 0.0 | 7 | 19 | 19.0 | 2022-07-08T10:00:00+02:00 | 2022-07-08T08:00:00+00:00
71105 | 2ee8345159e80000ecc04c00 | 23 | 61 | 0 | 0.0 | 100200 | 0 | ISS | 0.0 | 0.0 | 7 | 19 | 19.6 | 2022-07-08T10:15:00+02:00 | 2022-07-08T08:15:00+00:00
71106 | 2ee8355249980860acd84c00 | 23 | 51 | 12 | 0.0 | 100200 | 268 | ISS | 0.0 | 270.0 | 5 | 19 | 23.2 | 2022-07-08T12:15:00+02:00 | 2022-07-08T10:15:00+00:00
71107 | 2ee83592558827d10cd85840 | 23 | 49 | 12 | 0.2 | 100200 | 250 | ISS | 0.4 | 270.0 | 8 | 22 | 23.5 | 2022-07-08T12:45:00+02:00 | 2022-07-08T10:45:00+00:00
71108 | 2eec39b3c0f800000bc00800 | 23 | 31 | 0 | 0.0 | 99400 | 0 | ISS | 0.0 | 0.0 | 0 | 2 | 29.0 | 2022-07-12T21:00:00+02:00 | 2022-07-12T19:00:00+00:00
71109 | 2eec39d3591850000bc00040 | 23 | 35 | 0 | 0.2 | 99400 | 0 | ISS | 1.0 | 0.0 | 0 | 0 | 27.6 | 2022-07-12T21:15:00+02:00 | 2022-07-12T19:15:00+00:00
71110 | 2eec3ad21d5000000be01400 | 23 | 42 | 0 | 0.0 | 99500 | 0 | ISS | 0.0 | 0.0 | 0 | 5 | 22.7 | 2022-07-12T23:15:00+02:00 | 2022-07-12T21:15:00+00:00